Abstract

We have systematically studied the optimal real-space sampling of
atomic pair distribution data by comparing refinement results from
oversampled and resampled data. Based on nickel and a complex
perovskite system, we demonstrate that the optimal sampling is
bounded by the Nyquist interval described by the Nyquist-Shannon
sampling theorem. Near this sampling interval, the data points in
the PDF are minimally correlated, which results in more reliable
uncertainty prediction. Furthermore, refinements using sparsely
sampled data may run many times faster than using oversampled data.
This investigation establishes a theoretically sound limit on the
amount of information contained in the PDF, which has ramifications
towards how PDF data are modeled.

1 Introduction

Atomic pair distribution function (PDF) analysis of x-ray and
neutron powder diffraction data is becoming prominent in structure
analysis of complex materials due to an increasing interest in
studying structure from nanoscale structural order. [1] Dedicated
experimental facilities are appearing for PDF studies [3] as well
as specialized software. [U [5J [5]
As more people in the structure-characterization 
community adopt the PDF method, it is important 
to reevaluate and strengthen our analysis techniques. 
To this end, we have investigated the information 
content in the PDF data allowing us to determine 
optimal grid spacings to use when calculating PDFs. 
The sampling grid for PDFs is typically chosen in an 
ad-hoc way, for example, to give a visually smooth 
PDF. The information content in the PDF does not
increase when the grid is made finer than a critical interval. If the
data are oversampled, not only is no new information 
introduced, the points in the PDF are not statisti- 
cally independent, [HI 1 1 Oj which leads to improper 
estimates of uncertainties in refinement parameters
and slows down the refinement. [11]

We have systematically studied the optimal sampling
interval for PDF data and demonstrate
that it is consistent with the value predicted by the
Nyquist-Shannon sampling theorem. [12] This gives
the minimum amount of information we need to com- 
pletely specify a PDF from a given F(Q). When 
this optimal sampling is enforced, we see significant 
speed-up in our PDF refinements accompanied by a 
small increase in estimated uncertainties due to the 
reduction of statistical correlations among the PDF 
points. When the data are made sparser than the op- 
timal sampling interval the refinement results rapidly 
become unreliable due to aliasing. 

2 The PDF method 

The PDF method is a total scattering technique for determining
local order in nanostructured materials. [10] The technique does
not require periodicity, so it is well suited for studying
nanoscale features in a variety of materials. [13] [14] The
experimental PDF, denoted G(r), is the truncated Fourier transform
of the total scattering structure function,
F(Q) = Q[S(Q) - 1]: [15]

    G(r) = \frac{2}{\pi} \int_{Q_{\min}}^{Q_{\max}} F(Q) \sin(Qr) \, dQ,    (1)

where Q is the magnitude of the scattering momentum. The structure
function, S(Q), is extracted from the Bragg and diffuse components
of x-ray, neutron or electron powder diffraction intensity. For
elastic scattering, Q = 4π sin(θ)/λ, where λ is the scattering
wavelength and 2θ is the scattering angle. In practice, values of
Q_min and Q_max are determined by the experimental setup and Q_max
is often reduced below the experimental maximum to eliminate noisy
data from the PDF since the signal-to-noise ratio becomes
unfavorable in the high-Q region.
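
As an illustration of Eq. (1), the sine transform can be evaluated with a simple quadrature. The sketch below is not the procedure used by the data-reduction programs named in this paper; the function name `pdf_from_fq`, the trapezoid rule, and the toy input F(Q) = sin(Q r0) (a single undamped pair distance at a hypothetical r0 = 2.5 Å) are assumptions made only for this example.

```python
import numpy as np

def pdf_from_fq(q, fq, r):
    """Evaluate G(r) = (2/pi) * integral of F(Q) sin(Q r) dQ by the trapezoid rule."""
    q = np.asarray(q, dtype=float)
    fq = np.asarray(fq, dtype=float)
    r = np.asarray(r, dtype=float)
    # Integrand on an (n_r, n_q) grid: one row per r value.
    integrand = fq[None, :] * np.sin(np.outer(r, q))
    dq = np.diff(q)
    # Trapezoid rule written out explicitly; it also handles non-uniform Q grids.
    integral = np.sum(0.5 * (integrand[:, 1:] + integrand[:, :-1]) * dq[None, :], axis=1)
    return (2.0 / np.pi) * integral

# Toy input (hypothetical): a single pair distance r0 with no thermal damping.
r0 = 2.5                              # angstroms
q = np.linspace(0.0, 30.0, 3001)      # Q from 0 to Q_max = 30 inverse angstroms
fq = np.sin(q * r0)
r = np.arange(0.5, 5.0, 0.01)
g = pdf_from_fq(q, fq, r)
r_peak = float(r[np.argmax(g)])       # position of the strongest peak in G(r)
```

For this toy input the strongest peak of G(r) sits at r0, and its finite width and side ripples come entirely from truncating the integral at Q_max.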

The PDF gives the scaled probability of finding two atoms in a
material a distance r apart and is related to the density of atom
pairs in the material. [10] For a macroscopic scatterer, G(r) can
be calculated from a known structure model according to

    G(r) = 4\pi r \left[ \rho(r) - \rho_0 \right],
    \rho(r) = \frac{1}{4\pi r^2 N_a} \sum_i \sum_{j \neq i} \frac{b_i b_j}{\langle b \rangle^2} \, \delta(r - r_{ij}).    (2)

Here, ρ_0 is the atomic number density of the material and ρ(r) is
the atomic pair density, which is the mean weighted density of
neighbor atoms at distance r from an atom at the origin. The sums
in ρ(r) run over all N_a atoms in the sample, b_i is the scattering
factor of atom i, ⟨b⟩ is the average scattering factor and r_ij is
the distance between atoms i and j.

In practice, we use Eqs. (2) to fit the PDF generated
from a structure model to a PDF determined from
experiment. For this purpose, the delta functions in
Eqs. (2) are Gaussian-broadened and the equation is
modified to account for experimental effects. PDF 
modeling is performed by adjusting the parameters 
of the structure model, such as the lattice constants, 
atom positions and anisotropic atomic displacement 
parameters, to maximize the agreement between the 
theoretical and an experimental PDF. This procedure 
is implemented in PDFgui, [4] which is the program
used in this study. PDFgui uses the Levenberg- 
Marquardt algorithm [121 HZ] to locally optimize the 
model structure. The algorithm also provides esti- 
mates of uncertainties on those parameters upon con- 
vergence, though strictly the estimates are only ac- 
curate if the data are independent and the statistical 
errors are Gaussian distributed and properly deter- 
mined. [11] 

3 The Nyquist-Shannon sampling theorem

The Nyquist-Shannon sampling theorem specifies an upper bound on
the sampling interval of a discretized signal in the time domain
such that the sample contains all the available frequency
information from the signal. This upper bound is π/Δω, where Δω is
the angular frequency bandwidth of the signal. [12] The quantity
π/Δω is commonly referred to as the Nyquist interval. A continuous
or discrete signal sampled on a grid finer than the Nyquist
interval can be, in principle, perfectly reconstructed via
interpolation, since the sampling does not compromise the
information content of the signal.

In relation to the PDF, the angular frequency domain is Q-space and
we are interested in sampling in r-space, the analogue of the time
domain. The frequency information is specified by F(Q) (see
Eq. (1)), which has bandwidth Q_max.¹ This gives a Nyquist interval
of

    dr_N = \pi / Q_{\max}.    (3)

The sampling theorem states that the PDF can be sampled on any grid
with intervals shorter than this without losing any information
from F(Q).
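
As a quick numerical check of the Nyquist interval dr_N = π/Q_max, the short sketch below evaluates it for the two Q_max values used later in this study (29.9 Å⁻¹ for nickel and 32.0 Å⁻¹ for LaMnO3). The function name `nyquist_interval` is introduced here for illustration only.

```python
import math

def nyquist_interval(q_max):
    """Return dr_N = pi / Q_max; Q_max in inverse angstroms, result in angstroms."""
    if q_max <= 0:
        raise ValueError("q_max must be positive")
    return math.pi / q_max

dr_ni = nyquist_interval(29.9)    # nickel x-ray data
dr_lmo = nyquist_interval(32.0)   # LaMnO3 neutron data
print(f"Ni: {dr_ni:.3f} A, LMO: {dr_lmo:.4f} A")
```

Any refinement grid with dr strictly smaller than these values retains the full information content of the corresponding F(Q).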

¹ The sampling theorem as presented in Shannon's paper deals with
signals having positive and negative frequency components. The
bandwidth is defined as the maximum absolute frequency value.
Mathematically, F(Q) is an odd function (see Eq. 15 in [15]), a
fact we use when transforming F(Q) to G(r) (Eq. (1)). The "full"
spectrum of F(Q) that includes the negative-frequency branch can be
calculated from the positive-frequency branch, and spans the range
[-Q_max, Q_max]. Q_min does not enter into this since we enforce
F(Q < Q_min) = 0 during modeling. [15]

Whittaker [18] and Shannon [12] describe an interpolation formula
for reconstructing a signal from samples taken on a grid with
interval, dr, less than the Nyquist interval. In terms of the PDF,
the reconstruction formula is

    G'(r) = \sum_n G(n \, dr) \, \frac{\sin(\pi (r/dr - n))}{\pi (r/dr - n)},    (4)

where n iterates over the points of the sample. Later we will
demonstrate the benefits of modeling the PDF on an optimally
sampled grid. This formula allows us to interpolate a model PDF
onto a denser grid, e.g. for convenient visual inspection. In
practice, the sampled data must extend beyond the desired range to
avoid reconstruction errors in the high-r region.
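
The Whittaker-Shannon reconstruction of Eq. (4) can be sketched in a few lines with NumPy, whose `np.sinc(x)` is defined as sin(πx)/(πx). This is a minimal illustration rather than the implementation in any PDF program: the function name `whittaker_shannon` and the Gaussian-damped sine used as a stand-in signal are assumptions. The test signal decays to nearly zero at both ends of the sampled range, which sidesteps the edge errors mentioned above.

```python
import numpy as np

def whittaker_shannon(r_new, r_grid, g_grid):
    """Interpolate samples g_grid from the uniform grid r_grid onto r_new (Eq. 4)."""
    r_new = np.asarray(r_new, dtype=float)
    r_grid = np.asarray(r_grid, dtype=float)
    g_grid = np.asarray(g_grid, dtype=float)
    dr = r_grid[1] - r_grid[0]
    # np.sinc(x) = sin(pi x) / (pi x); each column is the kernel centred on one sample.
    kernel = np.sinc((r_new[:, None] - r_grid[None, :]) / dr)
    return kernel @ g_grid

def toy_signal(r):
    """Hypothetical, effectively band-limited stand-in for a PDF."""
    return np.sin(5.0 * r) * np.exp(-((r - 30.0) / 6.0) ** 2)

dr = 0.15                                  # far below pi / 5, so the toy signal is oversampled
r_grid = np.arange(0.0, 60.0, dr)
r_fine = np.arange(20.0, 40.0, 0.01)       # denser grid for display
g_fine = whittaker_shannon(r_fine, r_grid, toy_signal(r_grid))
max_error = float(np.max(np.abs(g_fine - toy_signal(r_fine))))
```

The dense kernel matrix costs memory proportional to the product of the two grid sizes, which is acceptable for a PDF with a few hundred points but would need chunking for much larger signals.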

3.1 Aliasing 

Sampling G(r) at or coarser than the Nyquist interval results in
aliasing. This term refers to how, in undersampled data, high-Q
information in F(Q) can masquerade as intensity at lower Q. This is
demonstrated for the PDF by considering its Fourier series over
-r_max < r < r_max. We choose this range because it lets us
consider the sine-Fourier series (G(r) is odd) and because the PDF
over this range contains the same information as the PDF over
0 < r < r_max. Now,

    G(r) = \sum_{m=1}^{m_{\max}} b_m \sin(Q_m r),

where Q_m = mπ/r_max. Since G(r) contains no frequency components
greater than Q_max, Q_m ≤ Q_max, and thus m_max ≤ Q_max r_max/π.

Consider the m-th term of the series sampled on the interval
dr = π/Q', where Q' and m are chosen such that Q' ≤ Q_m ≤ Q_max.
For the n-th sample, the contribution to the Fourier series is
b_m sin(n dr Q_m). Given the relationship between Q_m and Q',
n dr Q_m ≥ n(π/Q')Q' = nπ. Thus, we can represent the argument as
(Q_m - 2Q') n dr + 2nπ, so that the m-th frequency component of the
sample looks like -b_m sin((2Q' - Q_m) n dr) for all n. The
contribution to G(r) from F(Q) at Q = Q_m therefore appears in G(r)
as if it came from Q = 2Q' - Q_m in F(Q). Consequently, in F(Q) the
signal above Q' gets "folded" back to lower Q and overlaps with the
signal in the range 2Q' - Q_max < Q < Q'. This explains how
information in F(Q) is progressively lost in G(r) if G(r) is
calculated on grids that are too coarse. The more undersampled the
data, the greater the Q-range that is folded back and the greater
the loss of information in G(r) due to overlapping signals from
different Q-values. The effect is illustrated in Fig. 1.
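
The folding argument can be checked numerically: on a grid with dr = π/Q', a sine component at Q_m above Q' produces exactly the same samples as a component of opposite sign at 2Q' - Q_m. The sketch below uses the nickel values quoted in Fig. 1 (Q_max = 29.9 Å⁻¹, dr = 0.15 Å); the choice Q_m = 25 Å⁻¹ and all variable names are assumptions for this example.

```python
import numpy as np

q_max = 29.9                  # inverse angstroms, as for the nickel data
dr = 0.15                     # angstroms, coarser than the Nyquist interval pi / q_max
q_fold = np.pi / dr           # Q' in the text, about 20.9 inverse angstroms
q_m = 25.0                    # hypothetical component with Q' < Q_m < Q_max
q_alias = 2.0 * q_fold - q_m  # where the component reappears after folding

n = np.arange(0, 134)         # sample indices covering roughly 0 to 20 angstroms
true_samples = np.sin(q_m * n * dr)
alias_samples = -np.sin(q_alias * n * dr)
max_diff = float(np.max(np.abs(true_samples - alias_samples)))
```

Because the two sets of samples agree to rounding error, no fit to the sampled G(r) can distinguish intensity at Q_m from intensity at 2Q' - Q_m.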

We note that in the case where the data are sampled precisely on a
grid with the Nyquist interval, dr = dr_N, Q' = Q_m = Q_max and
there is no folding. However, there is still loss of information
since sin(Q_m n dr) = 0, and so the m-th Fourier amplitude, b_m,
can take on any value. This is why a strict inequality between the
sampling interval and the Nyquist interval is required to avoid
aliasing: dr < dr_N.

Figure 1: Demonstration of aliasing in F(Q). (Top) Experimental
nickel F(Q) with Q_max = 29.9 Å⁻¹ featuring regions above and below
Q' = 20.9 Å⁻¹. (Center) Experimental nickel F(Q) with the region
above Q' "folded" over to lower Q. (Bottom) Aliased F(Q) obtained
by sampling the PDF from the experimental F(Q) on a grid with
interval dr = 0.15 Å and Fourier transforming back to F(Q) (solid
line). This sampling interval is larger than the Nyquist interval
(dr_N = 0.105 Å) and corresponds to Q' = π/dr = 20.9 Å⁻¹. Overlaid
is the F(Q) obtained by adding the unfolded and folded segments of
the experimental F(Q) (dashed line). Note that the Q-axis starts at
10 Å⁻¹.

Aliasing implies that the sampled signal does not uniquely identify
its source. Since some frequency components alias others, the PDF
could represent the aliased F(Q) just as well as the unaliased one.
When back-Fourier transforming a sparsely sampled G(r) into
Q-space, the aliased F(Q) will result. The sampling theorem states
that aliasing does not occur when sampling at an interval smaller
than the Nyquist interval.

3.2 Structural Information in the PDF

The sampling theorem determines the number of data points required
to reconstruct a PDF signal from samples, which is

    N = \Delta r / dr_N = \Delta r \, Q_{\max} / \pi,    (5)

where Δr is the extent of the PDF in r-space. What is more relevant
to PDF modeling is the amount of structural information in the PDF.
N is an upper bound on this since we cannot extract more
independent observations of the structure than raw information from
the signal. Given perfect data and the proper model, one can
meaningfully extract N structural parameters from a PDF signal.

Factors such as noise and peak overlap can obscure the structural
information in the PDF and therefore determine whether N is a good
estimate of the amount of structural information in the PDF. For
example, consider a situation where the PDF contains a single peak,
but has a very large Q_max. In this case, a complete crystal model
cannot be obtained from fitting this single peak, no matter how
large N is. In another extreme case, imagine that the majority of
PDF peaks have a single point or no points due to a small Q_max. In
this situation the anisotropic displacement parameters cannot be
determined with certainty.

In practice, the amount of structural information 
in the PDF cannot be precisely known. To perform a 
reliable refinement, the signal-to-noise ratio must be 
favorable, [10] the PDF peaks must be apparent, and
the fit range must be such that the structural features 
one is seeking to model are accessible. In addition to 
this, we recommend using Rietveld refinement guide- 
lines when refining the PDF, which advise that the 
ratio of independent observations to the number of re- 
finement parameters should be around three to five, 
preferring the latter. [19] 

4 Method 

Powder diffraction data were collected from nickel and LaMnO3 (LMO)
samples. The nickel data were collected using the rapid acquisition
pair distribution function (RaPDF) technique [20] with synchrotron
x-rays on beamline 6-ID-D at the Advanced Photon Source at Argonne
National Laboratory. The sample was purchased from Alfa Aesar. The
powdered sample was packed in a flat plate holder with thickness of
1.0 mm and sealed between Kapton tapes. Data were collected at room
temperature in transmission geometry with an x-ray energy of
98.001 keV (λ = 0.12651 Å). An image plate camera (Mar345) with
diameter of 345 mm was mounted orthogonally to the beam with a
sample to detector distance of 178.4 mm.

The raw 2D data were reduced to 1D integrated intensity profiles
using the Fit2D program. [21] Corrections for environmental
scattering, incoherent and multiple scattering, polarization and
absorption were performed according to the standard procedures [10]
using PDFgetX2 [5] to obtain the PDF with Q_max = 29.9 Å⁻¹. This
corresponds to dr_N = 0.105 Å.

The LMO data were collected using time-of-flight neutron
diffraction at the NPDF instrument at the Los Alamos Neutron
Scattering Center at Los Alamos National Laboratory. The LMO sample
preparation and data collection have been described in detail
elsewhere. [22] The LMO PDFs were produced with PDFgetN [7] using
Q_max = 32.0 Å⁻¹. This corresponds to dr_N = 0.0982 Å.

In each case, experimental PDFs were generated with r_max = 20 Å
using dr = 0.01 Å. PDF data on sparser grids were created by
removing points from this PDF in order to get the desired sampling
interval. Pruning the data in this way is equivalent to
recalculating the PDF from F(Q) on the sparser grid. We produced 31
data-sets with varying dr against which models were refined.
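
Pruning a finely sampled PDF onto a sparser grid amounts to keeping every k-th point, which is one way to obtain grids whose interval is an integer multiple of the 0.01 Å reference spacing. The sketch below is an assumption about how such pruning could be scripted, not the authors' code; `prune_grid` is a hypothetical helper and the array `g` holds placeholder values rather than measured data.

```python
import numpy as np

def prune_grid(r, g, step):
    """Keep every step-th point of a uniformly sampled PDF."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    return r[::step], g[::step]

r = 0.01 * np.arange(1, 2001)   # 0.01, 0.02, ..., 20.00 angstroms
g = np.sin(3.0 * r) / r         # placeholder values, not a measured PDF
r_sparse, g_sparse = prune_grid(r, g, 10)   # dr = 0.10 angstroms
```

Because the retained points are untouched, the sparse data set is identical to what a direct calculation of G(r) on the coarser grid would give at those r values.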

We took as a reference data-set the PDF generated on the default
grid of dr = 0.01 Å, and structural models were refined to the
data. We then refined the same models to data-sets on sparser
grids. We define Δ_p(dr) for a parameter p as the absolute
difference between the value of the parameter p refined for the
data-set sampled at interval dr and that refined for the reference
data-set. The accuracy of the refined parameters becomes
unacceptable when Δ_p(dr) exceeds the statistical uncertainty on
the difference, σ(Δ_p(dr)). This is given by

    \sigma(\Delta_p(dr)) = \sqrt{\sigma^2(p(dr)) + \sigma^2(p(0.01))},

where σ(p(dr)) and σ(p(0.01)) are the estimated uncertainties on
parameter p taken from the refinement for the data-set sampled at
interval dr and the reference data-set, respectively. To determine
if a refined parameter extracted from a sparse data set is
accurate, we define a parameter quality factor,

    Q_p(dr) = \Delta_p(dr) / \sigma(\Delta_p(dr)).

If Q_p(dr) is less than or equal to one, the parameter value
refined from the data-set sampled at interval dr is within the
expected uncertainty of the best estimate and is considered
accurate. If Q_p(dr) is greater than one, the change in the
parameter's value is greater than the expected uncertainty, and the
result is considered unreliable.
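
As a worked example of the quality factor, the sketch below applies the definition to the nickel lattice parameter values reported in Table 1: a = 3.53159(2) Å on the reference grid, 3.53158(6) Å at dr = 0.10 Å and 3.53186(10) Å at dr = 0.30 Å. The function name `parameter_quality` is introduced here for illustration.

```python
import math

def parameter_quality(p, sigma_p, p_ref, sigma_ref):
    """Q_p = |p - p_ref| / sqrt(sigma_p**2 + sigma_ref**2)."""
    delta = abs(p - p_ref)
    sigma_delta = math.hypot(sigma_p, sigma_ref)
    return delta / sigma_delta

# Nickel lattice parameter a (angstroms) from Table 1; reference grid is dr = 0.01.
q_a_010 = parameter_quality(3.53158, 0.00006, 3.53159, 0.00002)   # dr = 0.10
q_a_030 = parameter_quality(3.53186, 0.00010, 3.53159, 0.00002)   # dr = 0.30
print(f"Q_a(0.10) = {q_a_010:.2f}, Q_a(0.30) = {q_a_030:.2f}")
```

The value at dr = 0.10 Å is well below one, while the value at dr = 0.30 Å is above one and would be classed as unreliable.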

The parameter quality measure, Q_p(dr), is biased due to a couple
of assumptions. First, by comparing all results with the refinement
of the undiluted data we assume that this refinement gives the best
estimate for each parameter. The validity of this assumption is
dependent on the systematic bias of the refinement results due to
the quality of the data and the suitability of the refinement
model. Since this bias is present in the diluted data as well, its
effects should be negligible. Second, we assume that the
uncertainty value derived from the refinement results is accurate.
We discuss later that the uncertainty values derived from
refinements of oversampled data-sets are too small. This inflates
the estimated quality factor when the data are oversampled, but
does not invalidate the accompanying results.

The refinements from unaltered and sampled data-sets were performed
identically over a range from r_min = 0.01 Å to r_max = 20.0 Å
using the program PDFgui. [4] For the nickel data, the lattice
parameter, isotropic atomic displacement parameter (ADP), dynamic
correlation factor, scale factor and resolution factor were varied
in the refinements. In the LMO fits, three lattice parameters, four
isotropic ADPs (one each for the La, Mn and axial and planar oxygen
atoms), and seven fractional coordinates were varied along with the
scale and correlation factors (see [13]). From Eq. (5) we get that
refinements over this range, Δr = 19.99 Å, yield N_Ni = 191 and
N_LMO = 203. For the nickel data set, we have an
observation-to-parameter ratio (OPR) greater than 30 and for LMO
the OPR is greater than 10. The refinements are therefore
comfortably over-constrained and the optimization problem is well
conditioned.
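
The observation-to-parameter ratios can be reproduced from N = Δr Q_max/π. In the sketch below the parameter counts (5 for nickel, 16 for LMO) are tallied from the lists of refined variables given above, and the helper name `independent_points` is an assumption. The unrounded N values come out near 190 and 204, close to the quoted 191 and 203.

```python
import math

def independent_points(delta_r, q_max):
    """Upper bound N = delta_r * Q_max / pi on independent points in the PDF."""
    return delta_r * q_max / math.pi

delta_r = 19.99                                  # angstroms, fit range 0.01 to 20.0
n_ni = independent_points(delta_r, 29.9)         # nickel
n_lmo = independent_points(delta_r, 32.0)        # LaMnO3

# Parameter counts tallied from the text: Ni has 5 refined variables,
# LMO has 3 + 4 + 7 + 2 = 16.
opr_ni = n_ni / 5
opr_lmo = n_lmo / 16
print(f"N_Ni = {n_ni:.1f}, OPR = {opr_ni:.1f}; N_LMO = {n_lmo:.1f}, OPR = {opr_lmo:.1f}")
```

Both ratios sit comfortably above the three-to-five guideline mentioned in Section 3.2.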

Various refinements were timed to measure the 
speed-up in the program execution due to sampling. 

5 Results 

When the nickel and LMO data are made sparser, the PDF profiles
appear less smooth and the detailed shape of the peak profiles
becomes less apparent. This is shown in Figs. 2 and 3. The data in
panel (a) in both figures are on the reference grid (dr = 0.01 Å);
they are smooth and have well-defined Gaussian-like peaks. [10] The
data in panel (b) are sampled with dr = 0.1 Å, close to the Nyquist
interval, and are not nearly as smooth, though the peaks are still
well defined. Lastly, the data in panel (c) are sampled with
dr = 0.3 Å, where there is apparent loss of information. The
refined parameters from these fits are given in Table 1 and
Table 2. Note that the uncertainty in the refined parameters
increases from dr = 0.01 Å to dr = 0.1 Å, although each of these
data-sets produces acceptable results.

Figure 2: Fits to sampled nickel PDFs. (a) Unaltered data with
dr = 0.01 Å. (b) Sampled data with dr = 0.1 Å. (c) Sampled data
with dr = 0.3 Å. The data are shown as circles, the fits are the
lines through the data and the difference is shown offset below.
All fits are of similar quality, despite the poor visual quality of
the data in panels (b) and (c). The data shown in panel (c) are
undersampled and produced unacceptably uncertain results, though
this is not apparent from the difference curve.

Figure 3: Fits to sampled LaMnO3 PDFs. (a) Unaltered data with
dr = 0.01 Å. (b) Sampled data with dr = 0.1 Å. (c) Sampled data
with dr = 0.3 Å. The data are shown as circles, the fits are the
lines through the data and the difference is shown offset below.
All fits are of similar quality, despite the poor visual quality of
the data in panels (b) and (c). The data shown in panels (b) and
(c) are undersampled, and the data in panel (c) produced
unacceptably uncertain results. Note that in panel (c) several
peaks are not resolved.



Table 1: Parameters from Ni refinements using data with various dr.
The Nyquist interval, dr_N, is 0.105 Å. Here, a denotes the lattice
parameter, U_iso the isotropic ADP, δ_2 the vibrational correlation
parameter, scale the data scale and Q_damp the experimental
resolution factor.

dr (Å)          0.01          0.10          0.12          0.30
a (Å)           3.53159(2)    3.53158(6)    3.53158(6)    3.53186(10)
U_iso (Å²)      0.005446(7)   0.00545(2)    0.00543(2)    0.00570(4)
δ_2 (Å²)        2.25(2)       2.20(5)       2.15(5)       2.2(2)
scale           0.7324(7)     0.733(2)      0.734(3)      0.761(4)
Q_damp (Å⁻¹)    0.06307(11)   0.0632(4)     0.0634(4)     0.0653(7)

Table 2: Parameters from LaMnO3 refinements using data with various
dr. The Nyquist interval, dr_N, is 0.0982 Å. Here, a, b and c
denote the lattice parameters, U_iso the isotropic ADP (one for
each primitive atom), x, y and z the fractional atomic coordinates,
δ_2 the vibrational correlation parameter and scale the data scale.

dr (Å)          0.01          0.10          0.12          0.30
a (Å)           5.5394(2)     5.5394(6)     5.5393(7)     5.5362(14)
b (Å)           5.7441(2)     5.7443(7)     5.7442(8)     5.7536(13)
c (Å)           7.7059(2)     7.7059(9)     7.7054(10)    7.697(2)
δ_2 (Å²)        2.44(3)       2.38(9)       2.35(9)       2.49(14)
scale           0.7941(11)    0.794(3)      0.795(4)      0.803(6)
La
  x             0.99234(10)   0.9923(3)     0.9926(4)     0.9917(6)
  y             0.04828(8)    0.0482(2)     0.0481(3)     0.0469(5)
  U_iso (Å²)    0.00508(4)    0.00506(13)   0.0052(2)     0.0055(2)
Mn
  U_iso (Å²)    0.00376(7)    0.0038(2)     0.0038(2)     0.0024(3)
O1
  x             0.07300(11)   0.0730(4)     0.0730(4)     0.0739(7)
  y             0.48625(10)   0.4862(3)     0.4864(4)     0.4874(7)
  U_iso (Å²)    0.00682(8)    0.0067(3)     0.0068(3)     0.0075(3)
O2
  x             0.72515(8)    0.7251(2)     0.7252(3)     0.7247(5)
  y             0.30682(8)    0.3068(3)     0.3069(3)     0.3072(5)
  z             0.03876(6)    0.0388(2)     0.0389(2)     0.0399(3)
  U_iso (Å²)    0.00689(4)    0.0069(2)     0.0068(2)     0.0062(2)

In Fig. 4 we show the parameter quality values, Q_p(dr), plotted
against the sampling interval. The quality factor is satisfactory
for data-sets that are sampled with grids close to the reference
data-set. This indicates that these refinements are producing the
same parameter values. Near the Nyquist interval (indicated by the
vertical dashed line), various quality factors become unacceptable.

6 Discussion 

Figure 4 indicates that the onset of unreliable refinements
coincides with the Nyquist interval. The refined parameter values
are all acceptable, and largely independent of the sampling
interval in the oversampling region (dr < dr_N). Figures 2 and 3
indicate that visual appearance is not a good indicator of data
quality.

The sampling theorem tells us that the informa- 
tion content in the data does not change as long as 
we sample on a grid finer than the Nyquist interval. 
We expect to and do refine the same parameters from 
such samples. As the data are sampled onto grids 
coarser than the Nyquist interval, we expect to lose 
structural information gradually. In contrast, refined 
values of the parameters become unreliable quickly as 
the Nyquist interval is exceeded. This is somewhat
surprising since the refinements are highly overconstrained
and have an estimated OPR greater than 5
even when sampled at twice the Nyquist interval. In
Fig. 4 we see the quality of the refined parameters
diverge well before this point. Intuition would tell us
that it is possible to lose a considerable quantity of 
information by sampling before refinements become 
unstable. This is not observed. The instability is not 
caused solely by information loss, but by information 
corruption due to aliasing. 

Aliasing has two effects on a PDF signal, as described in
Section 3.1. Foremost, aliasing lowers the effective maximum
Q-value in F(Q) from Q_max to Q' = π/dr. This creates the obvious
effect of lower resolution in the PDF, as seen in Figs. 2 and 3. In
extreme cases, this will lead to poorly defined peaks in the PDF.
Less obviously, sampling on a grid coarser than the Nyquist
interval allows for the possibility that the PDF has originated
from a different, aliased, F(Q) as shown in Fig. 1. When
calculating the model PDF, we enforce F(Q > Q_max) = 0. When there
is aliasing the structure function resulting from G(r) has
F(Q > π/dr) = 0, and extra intensity below π/dr. Thus, aliasing
makes it possible to find a different set of refinement parameters
that describes the sampled PDF. This is true regardless of the
optimization algorithm.

Figure 4: Refined parameter quality (open symbols) and refinement
times (solid circles) measured using sampled Ni (top) and LaMnO3
(bottom) data, plotted against dr (Å). The dotted horizontal line
shows the cutoff between acceptable and unacceptable parameter
quality. The dashed vertical line shows the value of dr_N predicted
by the sampling theorem. For dr values larger than this the quality
of some parameters transitions into the unacceptable region. The
time values demonstrate the decrease in refinement time with
increasing dr, with more than a seven-fold speed-up near dr_N. The
solid curve through the time values is fit to the form a + b/dr.

The estimated uncertainties on the fitting parameters for dr in the
region of stable refinements are dependent on the sampling
interval. We see from Tables 1 and 2 that the uncertainties on the
parameters increase when estimated from the data sampled near the
Nyquist interval compared to the reference data. The sampling
theorem gives the number of data points necessary to fully
represent the PDF. Any data sampled on a grid finer than the
Nyquist interval are necessarily redundant. If a set of fitting
parameters reproduces a particular set of points well on an optimal
grid, those parameters will also reproduce the associated redundant
points well. Not taking into account the correlations between data
points, [9] as in this study, therefore results in underestimated
uncertainty values on parameters. Refining optimally sampled data
reduces these correlations while retaining all the structural
information available in the data and gives a more reliable
estimate of uncertainties.

A fortunate side-effect of refining optimally sampled data is a
decreased refinement time. Shown in Fig. 4 is a plot of refinement
times for some chosen sampling intervals. The trend in the plot
shows that refinement time is proportional to the inverse of dr
(shown as the broad solid line), or directly proportional to the
number of data points, with a constant offset. This trend reflects
the fact that the cost of calculating the PDF grows linearly with
the number of sample points. Carrying out refinements on optimally
sampled data gives a significant speed increase compared to the
reference data; in this case the speed increases by more than a
factor of seven.
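
The timing model t = a + b/dr is linear in the unknowns a and b, so it can be fitted by ordinary least squares. The sketch below uses invented timings generated from a = 1.0 and b = 0.25 (arbitrary units), chosen only to show how a constant offset keeps the speed-up between dr = 0.01 Å and dr = 0.10 Å below the ideal factor of ten. These numbers are not the measured refinement times, and `fit_time_model` is a hypothetical helper.

```python
import numpy as np

def fit_time_model(dr, t):
    """Least-squares fit of t = a + b / dr; returns (a, b)."""
    dr = np.asarray(dr, dtype=float)
    t = np.asarray(t, dtype=float)
    # Design matrix with a constant column and a 1/dr column.
    design = np.column_stack([np.ones_like(dr), 1.0 / dr])
    (a, b), *_ = np.linalg.lstsq(design, t, rcond=None)
    return float(a), float(b)

# Synthetic timings (hypothetical): generated from a = 1.0, b = 0.25, no noise.
dr = np.array([0.01, 0.02, 0.05, 0.10, 0.20, 0.30])
t = 1.0 + 0.25 / dr
a, b = fit_time_model(dr, t)
speedup = (a + b / 0.01) / (a + b / 0.10)   # reference grid versus dr = 0.10
```

With real timings the same fit separates the fixed overhead a from the per-point cost b, which indicates how much further speed-up a sparser grid can deliver.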

These observations indicate that PDF refinements should be
performed on the sparsest grid possible with sampling interval less
than the Nyquist interval. To produce an esthetically pleasing
presentation of the PDF, one can interpolate onto a finer grid
using the Whittaker-Shannon interpolation formula (Eq. (4)).

7 Conclusions 

The purpose of this research was to demonstrate the 
consequences of the Nyquist-Shannon sampling the- 
orem as it applies to the PDF. We show that the 
quality of refined parameters diverges when sampling 
the PDF at intervals larger than the Nyquist interval, 
which is the result of aliasing. Furthermore, we show 
that the estimated uncertainties of refined parame- 
ters are more reliable when the PDF is optimally sam- 
pled. Statistically reliable uncertainties on refined 
parameters can be obtained by taking into account 
the correlations between all the points in G(r), [9] 
but this comes at the computational expense of in- 
verting a large error matrix. By optimally sampling 
the PDF, the correlations among points in the PDF 
are minimized while preserving all the available struc- 
tural information. This gives improved uncertainty 
estimates without costly computation, and may ex- 
pedite refinements when the PDF can be computed 
over fewer points. 

The Nyquist-Shannon sampling theorem gives an 
upper bound on the amount of structural informa- 
tion contained in an experimental PDF. This deter- 
mines the Q- and r-extent that are required for a 
model refinement to be overconstrained. Oversam- 
pling the PDF does not add more information to a 
refinement, and therefore provides no benefit other 
than an esthetically pleasing visualization. This result
emphasizes the importance of collecting diffraction data to high Q
when it is to be used for PDF modeling, since a larger Q_max
decreases the Nyquist interval, and makes accessible smaller
structural details.